Data Imbalance Problem solving for SMOTE Based Oversampling: Study on Fault Detection Prediction Model in Semiconductor Manufacturing Process

نویسندگان

  • Jaekwon Kim
  • Youngshin Han
  • Jongsik Lee
چکیده

Fault detection prediction of FAB (wafer fabrication) process in semiconductor manufacturing process is possible that improve product quality and reliability in accordance with the classification performance. However, FAB process is sometimes due to a fault occurs. And mostly it occurs “pass”. Hence, data imbalance occurs in the pass/fail class. If the data imbalance occurs, prediction models are difficult to predict “fail” class because increases the bias of majority class (pass class). In this paper, we propose the SMOTE (Synthetic Minority Oversampling Technique) based over sampling method for solving problem of data imbalance. The proposed method solve the imbalance of the between pass and fail by oversampling the minority class of fail. In addition, by applying the fault detection prediction model to measure the performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Extracting Predictor Variables to Construct Breast Cancer Survivability Model with Class Imbalance Problem

Application of data mining methods as a decision support system has a great benefit to predict survival of new patients. It also has a great potential for health researchers to investigate the relationship between risk factors and cancer survival. But due to the imbalanced nature of datasets associated with breast cancer survival, the accuracy of survival prognosis models is a challenging issue...

متن کامل

Evaluation of Classifiers in Software Fault-Proneness Prediction

Reliability of software counts on its fault-prone modules. This means that the less software consists of fault-prone units the more we may trust it. Therefore, if we are able to predict the number of fault-prone modules of software, it will be possible to judge the software reliability. In predicting software fault-prone modules, one of the contributing features is software metric by which one ...

متن کامل

Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning

In recent years, mining with imbalanced data sets receives more and more attentions in both theoretical and practical aspects. This paper introduces the importance of imbalanced data sets and their broad application domains in data mining, and then summarizes the evaluation metrics and the existing methods to evaluate and solve the imbalance problem. Synthetic minority oversampling technique (S...

متن کامل

Oversampling Method for Imbalanced Classification

Classification problem for imbalanced datasets is pervasive in a lot of data mining domains. Imbalanced classification has been a hot topic in the academic community. From data level to algorithm level, a lot of solutions have been proposed to tackle the problems resulted from imbalanced datasets. SMOTE is the most popular data-level method and a lot of derivations based on it are developed to ...

متن کامل

Consolidation Process: An Alternative to Intelligent Oversampling Methods in Class Imbalance Problems

In real world problems solved using data mining techniques, it is very usual to find data in which the number of examples of one of the classes is much smaller than the number of examples of the rest of the classes. Many works have been done to deal with these problems known as class imbalance problems. Most of them focus their effort on data resampling techniques so that training data would be...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016